Knowledge-Augmented Language Model and its Application to Unsupervised Named-Entity Recognition
Traditional language models are unable to efficiently model entity names
observed in text. All but the most popular named entities appear infrequently
in text, providing insufficient context. Recent efforts have recognized that
context can be generalized between entity names that share the same type (e.g.,
\emph{person} or \emph{location}) and have equipped language models with access
to an external knowledge base (KB). Our Knowledge-Augmented Language Model
(KALM) continues this line of work by augmenting a traditional model with a KB.
Unlike previous methods, however, we train with an end-to-end predictive
objective optimizing the perplexity of text. We do not require any additional
information such as named entity tags. In addition to improving language
modeling performance, KALM learns to recognize named entities in an entirely
unsupervised way by using entity type information latent in the model. On a
Named Entity Recognition (NER) task, KALM achieves performance comparable with
state-of-the-art supervised models. Our work demonstrates that named entities
(and possibly other types of world knowledge) can be modeled successfully using
predictive learning and training on large corpora of text without any
additional information.
Comment: NAACL 2019; updated to cite Zhou et al. (2018) EMNLP as a piece of
related work.
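
To make the mixture-over-entity-types idea concrete, the sketch below shows one way a KB-augmented LM head could marginalize over a latent entity-type variable while exposing the type posterior as an unsupervised NER signal. This is an illustrative simplification, not the authors' code; the dimensions, number of types, and module names are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class KBMixtureLMHead(nn.Module):
    """Next-word distribution as a mixture over latent entity types
    (type 0 = general words, types 1..K-1 = KB entity types)."""
    def __init__(self, hidden_dim, vocab_size, num_types):
        super().__init__()
        self.type_scorer = nn.Linear(hidden_dim, num_types)
        self.word_scorers = nn.ModuleList(
            [nn.Linear(hidden_dim, vocab_size) for _ in range(num_types)]
        )

    def forward(self, h):
        # h: (batch, hidden_dim) context representation from any LM encoder.
        type_post = F.softmax(self.type_scorer(h), dim=-1)           # (batch, K)
        per_type = torch.stack(
            [F.softmax(w(h), dim=-1) for w in self.word_scorers], dim=1
        )                                                            # (batch, K, V)
        # Marginalize over the latent type: p(w | h) = sum_t p(t | h) p(w | t, h).
        p_word = (type_post.unsqueeze(-1) * per_type).sum(dim=1)     # (batch, V)
        return p_word, type_post

head = KBMixtureLMHead(hidden_dim=256, vocab_size=10000, num_types=4)
p_word, type_post = head(torch.randn(2, 256))
# Training maximizes log p_word of the observed next word (a perplexity objective);
# the per-position argmax of type_post serves as an unsupervised entity-type tag.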
Conundrums in noun phrase coreference resolution: making sense of the state-of-the-art
Journal Article
We aim to shed light on the state-of-the-art in NP coreference resolution by teasing apart the differences in the MUC and ACE task definitions, the assumptions made in evaluation methodologies, and inherent differences in text corpora. First, we examine three subproblems that play a role in coreference resolution: named entity recognition, anaphoricity determination, and coreference element detection. We measure the impact of each subproblem on coreference resolution and confirm that certain assumptions regarding these subproblems in the evaluation methodology can dramatically simplify the overall task. Second, we measure the performance of a state-of-the-art coreference resolver on several classes of anaphora and use these results to develop a quantitative measure for estimating coreference resolution performance on new data sets.
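
As a concrete illustration of how scoring assumptions shape reported results, below is a minimal sketch of link-based MUC scoring, one of the evaluation schemes the paper contrasts. The representation of chains as sets of mention ids and the example chains are assumptions of this illustration, not data from the paper.

def muc_recall(key_chains, response_chains):
    """Link-based MUC recall: sum_i (|K_i| - p(K_i)) / sum_i (|K_i| - 1), where
    p(K_i) is the number of pieces chain K_i is split into by the response."""
    num = den = 0
    for k in key_chains:
        pieces = [k & r for r in response_chains if k & r]
        covered = set().union(*pieces) if pieces else set()
        p = len(pieces) + len(k - covered)   # unmatched mentions count as singletons
        num += len(k) - p
        den += len(k) - 1
    return num / den if den else 0.0

def muc_f1(key_chains, response_chains):
    r = muc_recall(key_chains, response_chains)
    p = muc_recall(response_chains, key_chains)  # precision is the symmetric case
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Gold chains {1,2,3} and {4,5}; system chains {1,2} and {3,4,5}: MUC F1 = 0.667.
print(muc_f1([{1, 2, 3}, {4, 5}], [{1, 2}, {3, 4, 5}]))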
System for studying the parameters of gas solenoid valves
The aim of the present work is to construct a test stand for determining the characteristics of different fourth-generation gas injectors operating under conditions as close as possible to actual operating ones. For this purpose, a standard fourth-generation gas system with liquefied petroleum gas (LPG) as the working fluid was used for the stand. A system has been developed to maintain the gas leakage pressure equal to the pressure in the intake manifold of a Spark Ignition (SI) engine. The LPG used is compressed and liquefied for reuse, and safety measures have been put in place. The stand provides the right conditions for determining the influence of the nozzle diameter, the length of the connecting pipe between the injector and the intake manifold, the pressure differential across the injector, and other factors that affect these characteristics, which may differ when an LPG system is installed on an internal combustion engine.
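
For intuition about how the listed factors interact, here is a minimal sketch based on the simple orifice-flow relation m_dot = Cd * A * sqrt(2 * rho * dP). The discharge coefficient, gas density, and pressure values are assumed for illustration; compressibility and choked flow are ignored, and the paper's actual measurement procedure may differ.

import math

def injector_mass_flow(nozzle_diameter_m, upstream_pa, downstream_pa,
                       gas_density_kg_m3, discharge_coeff=0.8):
    """Approximate injector mass flow (kg/s) through a single nozzle orifice."""
    area = math.pi * (nozzle_diameter_m / 2) ** 2
    dp = max(upstream_pa - downstream_pa, 0.0)
    return discharge_coeff * area * math.sqrt(2 * gas_density_kg_m3 * dp)

# Example: 2 mm nozzle, 1.2 bar rail pressure against 0.5 bar manifold pressure,
# LPG vapour density ~2 kg/m^3 (assumed values).
flow = injector_mass_flow(2e-3, 1.2e5, 0.5e5, 2.0)
print(f"{flow * 1000:.3f} g/s")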
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Large and sparse feed-forward layers (S-FFN) such as Mixture-of-Experts (MoE)
have proven effective in scaling up Transformer model size for
\textit{pretraining} large language models. By activating only part of the FFN
parameters conditioned on the input, S-FFN improves generalization performance
while keeping training and inference costs (in FLOPs) fixed. In this work, we
analyze two major design choices of S-FFN, the memory block (a.k.a. expert)
size and the memory block selection method, under a general conceptual framework
of sparse neural memory. Using this unified framework, we compare several S-FFN
architectures for language modeling and provide insights into their relative
efficacy and efficiency. We found that a simpler selection method,
\textbf{\texttt{Avg-K}}, which selects blocks through their mean aggregated
hidden states, achieves lower perplexity in language model pretraining than
existing MoE architectures, including the Switch Transformer (Fedus et al.,
2021) and HashLayer (Roller et al., 2021).
Comment: Accepted to EMNLP 2023